AITopics | maximin strategy

Efficient Minimax Strategies for Square Loss Games

Neural Information Processing Systems, Mar-13-2024, 10:39:13 GMT

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance.

brier game, maximin strategy, minimax strategy, (16 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > Queensland (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Industry: Leisure & Entertainment > Games (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)

Safe Equilibrium

Ganzfried, Sam

arXiv.org Artificial Intelligence, Jan-11-2022

In designing a strategy for a multiagent interaction, an agent must balance the assumption that opponents are behaving rationally against the risks that arise if opponents behave irrationally. Most classic game-theoretic solution concepts, such as Nash equilibrium (NE), assume that all players are behaving rationally (and that this fact is common knowledge). On the other hand, a maximin strategy is one that has the largest worst-case guaranteed expected payoff; this limits the potential downside against a worst-case and potentially irrational opponent, but can also cause us to achieve significantly lower payoff against rational opponents. In two-player zero-sum games, Nash equilibrium and maximin strategies are equivalent (by the minimax theorem), and these two goals are completely aligned. But in non-zero-sum games and games with more than two players, this is not the case. In these games we can potentially obtain arbitrarily low payoff by following a Nash equilibrium strategy, but if we follow a maximin strategy we will likely be playing far too conservatively. While the assumption that opponents exhibit a degree of rationality and the desire to limit worst-case performance against irrational opponents are both desirable, neither the Nash equilibrium nor the maximin solution concept is definitively compelling on its own. We propose a new solution concept that balances between these two extremes.
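
To make the maximin notion in this abstract concrete, here is a minimal sketch that computes our security level in a small matrix game. It is not taken from the paper: the function names `pure_maximin` and `mixed_maximin_2x2` are hypothetical, payoffs are assumed to be integers giving our own payoff (rows are our actions, columns the opponent's), and the mixed case is limited to 2x2 games, where the best worst-case payoff lies at an endpoint or where the two column payoffs cross.

```python
from fractions import Fraction


def pure_maximin(payoffs):
    """Return (row index, guaranteed payoff) for the best pure strategy."""
    # Worst case of each row: the opponent picks the column that hurts us most.
    worst = [min(row) for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: worst[i])
    return best, worst[best]


def mixed_maximin_2x2(payoffs):
    """Return (p, value) where p is the probability of playing row 0."""
    (a, b), (c, d) = payoffs

    def guaranteed(p):
        # Expected payoff against each column; the opponent picks the lower one.
        return min(p * a + (1 - p) * c, p * b + (1 - p) * d)

    # The guaranteed payoff is concave and piecewise linear in p, so its
    # maximum is at p = 0, p = 1, or where the two column payoffs are equal.
    candidates = [Fraction(0), Fraction(1)]
    denom = (a - c) - (b - d)
    if denom != 0:
        crossing = Fraction(d - c, denom)
        if 0 <= crossing <= 1:
            candidates.append(crossing)
    best = max(candidates, key=guaranteed)
    return best, guaranteed(best)


# Matching pennies, from the row player's point of view (an illustrative game).
pennies = [[1, -1], [-1, 1]]
pure_choice = pure_maximin(pennies)
mixed_choice = mixed_maximin_2x2(pennies)
```

In this illustrative game every pure strategy can be exploited, while mixing evenly guarantees an expected payoff of zero, which shows why the maximin value is usually defined over mixed strategies.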

algorithm, equilibrium, opponent, (17 more...)

arXiv.org Artificial Intelligence

2201.04266

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Efficient Minimax Strategies for Square Loss Games

Koolen, Wouter M., Malek, Alan, Bartlett, Peter L.

Neural Information Processing Systems, Dec-31-2014

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We derive the minimax solutions for the case where the prediction and action spaces are the simplex (this setup is sometimes called the Brier game) and the $\ell_2$ ball (this setup is related to Gaussian density estimation). We show that in both cases the value of each sub-game is a quadratic function of a simple statistic of the state, with coefficients that can be efficiently computed using an explicit recurrence relation. The resulting deterministic minimax strategy and randomized maximin strategy are linear functions of the statistic.
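
As a small illustration of the loss named in this abstract, the sketch below evaluates a squared Mahalanobis distance between a prediction and an outcome. It assumes the common form $(a - x)^\top W (a - x)$ with a symmetric positive definite matrix `W`; the abstract does not spell out the form, and the function name `squared_mahalanobis` and its argument names are hypothetical. With `W` equal to the identity the value reduces to the squared Euclidean distance.

```python
def squared_mahalanobis(a, x, W):
    """Return (a - x)^T W (a - x) for vectors a, x and a square matrix W."""
    diff = [ai - xi for ai, xi in zip(a, x)]
    n = len(diff)
    # Quadratic form: sum over i, j of diff[i] * W[i][j] * diff[j].
    return sum(diff[i] * W[i][j] * diff[j] for i in range(n) for j in range(n))


prediction = [1, 2]
outcome = [0, 0]
identity = [[1, 0], [0, 1]]
weighted = [[2, 1], [1, 2]]  # assumed example weights, symmetric positive definite

euclidean_loss = squared_mahalanobis(prediction, outcome, identity)
mahalanobis_loss = squared_mahalanobis(prediction, outcome, weighted)
```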

artificial intelligence, machine learning, minimax strategy, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > United States > California (0.14)

Industry: Leisure & Entertainment > Games (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)

New Criteria and a New Algorithm for Learning in Multi-Agent Systems

Powers, Rob, Shoham, Yoav

Neural Information Processing Systems, Dec-31-2005

We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previously proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion) the algorithm yield a payoff that approaches the payoff of the best response, (b) against other opponents the algorithm's payoff at least approach (and possibly exceed) the security level payoff (or maximin value), and (c) subject to these requirements, the algorithm achieve a close-to-optimal payoff in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm, and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory, but also empirically. Using a recently introduced comprehensive game-theoretic test suite, we show that the algorithm almost universally outperforms previous learning algorithms.

algorithm, criteria, opponent, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)

Industry: Leisure & Entertainment > Games (0.94)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

New Criteria and a New Algorithm for Learning in Multi-Agent Systems

Powers, Rob, Shoham, Yoav

Neural Information Processing Systems, Dec-31-2005

We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previously proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion) the algorithm yield a payoff that approaches the payoff of the best response, (b) against other opponents the algorithm's payoff at least approach (and possibly exceed) the security level payoff (or maximin value), and (c) subject to these requirements, the algorithm achieve a close-to-optimal payoff in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm, and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory, but also empirically. Using a recently introduced comprehensive game-theoretic test suite, we show that the algorithm almost universally outperforms previous learning algorithms.

algorithm, artificial intelligence, opponent, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County (0.14)

Industry: Leisure & Entertainment > Games (0.94)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
